SOME QUANTITATIVE STUDIES IN EPIDEMIOLOGY.

An account of some quantitative studies in epidemiology
has recently been published in the second edition of
my book on the “Prevention of Malaria” (Murray), and
the Editor of Nature has asked me to give a general 
description of them here. The attempts originated in the 
following manner. Shortly after Anophelines were shown 
to carry malaria, it was often observed that little apparent 
correlation could be found between their numbers and the 
numbers of infected persons in a locality. The observations
were always far too scanty to establish any real
absence of correlation; but they were used, nevertheless, to 
support the thesis that the amount of malaria does not 
depend upon the number of the Anophelines, and that 
therefore the proposed anti-malarial measure of mosquito 
reduction (then very unpopular) was useless. For many 
reasons a trustworthy experimental investigation would 
have been very difficult and costly, and it was therefore 
all the more necessary to examine the subject by a carefully
reasoned analysis of the relations which must hold
between the amount of the disease and the various factors 
which influence it. My first attempt in this direction was 
made in an official report on the “Prevention of Malaria
in Mauritius” (Waterlow and Sons, 1908), and fell into
the form of a simple difference equation. This was further 
developed in the first edition of my book already mentioned, 
and the subject was at the same time ably attacked by Mr. 
H. Waite, at the instance of Prof. Karl Pearson, in 
Biometrika, October, 1910.

The attempt now referred to aims at extending the 
reasoning to infectious diseases in general. The object is 
as follows. Suppose that a given proportion of a population
in a given locality at a given moment are infected
with some disease. Then we know from experience that
the number will not remain fixed, but will vary from time
to time and from place to place. The problem is to calculate
these variations on the supposition that all the
coefficients are known, which, of course, is by no means 
always the case. The use of the calculation will be (1) to 
obtain more light regarding the coefficients by comparing 
calculated with observed results; (2) to obtain quantitative 
estimates as to how far each coefficient should affect the 
result; and (3) to improve preventive measures by showing 
which factors they should be directed against. My studies 
have been hitherto concerned only with time-to-time variations,
and the reader will understand that they require
verification and completion by better mathematicians than 
myself. So far as I can ascertain, the subject has been 
little dealt with hitherto. 

We must first obtain clear ideas on some points. 
Infectedness is not the same thing as sickness. Infectedness
begins when the infecting organisms first enter the
body of the host (man, animal, or plant), and ceases only 
when the last of them die out of him or leave him, or 
when he himself dies. Sickness may be quite absent 
during the whole of this period, or may begin after an 
“incubation period”; may cease long before or long
after infectedness ceases, or may be intermittent. It is 
therefore merely an episode of infectedness, and one which 
does not concern us greatly just now. Another episode, 
and a more important one at the moment, is infectiveness,
that is, the state of the infected person during which
the infecting organisms are able to pass from him to 
others. The period or periods of infectiveness are always 
contained within the period of infectedness, but do not 
necessarily coincide with the periods of sickness. Thus 
typhoid or diphtheria carriers may be ill for only a week 
or so, or not at all, but may remain infective for months. 
In yellow fever, according to good researches, sickness 
and infectiveness begin together a few days after the commencement
of infectedness at inoculation; but infectiveness
ceases three days later, often long before the sickness is
over. In malaria, sickness and infectiveness are intermittent
and not coincident episodes, and may recur for
years. Infectedness itself is only the preliminary stage 
of affectedness, which begins at inoculation and does not 
end until the last trace of the resulting sickness or 
acquired immunity has vanished. Reinfection often occurs 
during existing affectedness, and may increase its duration
and that of the episodes. Medical treatment may
have the opposite effect, and natural immunity and prevention
may reduce susceptibility to infection. Lastly, the
natural fluctuations of population, due to births, deaths, 
immigration, and emigration, must be considered, and 
these may vary in consequence of the epidemic. 

Hence many coefficients have to be taken into account; 
and the principal difficulty lies, I fancy, in arranging for 
all of them in the equations. The course which I have 
adopted as being perhaps the best for a beginning is to 
conceive the matter in the most general terms possible 
by taking the act of infection as being one of any kind 
of event, such as accident, death, marriage, bankruptcy, 
receipt of bequests, insect-bite, &c., which may occur to 
a population, the various coefficients being at present taken 
as constant during the period considered. If such an 
event occurs to a given constant proportion of the population
in unit of time, how many affected people will there
be in the locality on a given date, on a most probable
estimate, and how many of these have been affected once,
twice, thrice, &c.? This simple form may be called the
problem of happenings, and its solution will often be
useful in epidemiology, as, for instance, in estimating the 
most probable frequency of reinfections or of insect-bites. 
But for some kinds of events, such as marriage, wealth, 
and infectedness, we must contemplate a continuance of 
the event in the individual, with a possible reversion to 
the unaffected class after the cessation of affectedness. 
Such events may be called becomings; and we have now 
to find the proportion of the population in this condition 
on a given date. 

I will treat the equations as briefly as possible.
Consider the following:

$$
\begin{aligned}
a_{t+1} &= (1-h)\,v\,a_t + H\,V\,z_t \\
z_{t+1} &= h\,v\,a_t + (1-H)\,V\,z_t \\
p_{t+1} &= v\,a_t + V\,z_t \qquad (1)
\end{aligned}
$$

Here $a_t$ and $z_t$ are respectively the numbers of unaffected
and affected individuals, and $p_t$ is the total population at
the end of t units of time; v and V are respectively the
variations in number of the unaffected and the affected
due to births, deaths, immigration, and emigration in
unit of time; h is the proportion of the unaffected which
become affected, and H the proportion of the affected
which become unaffected (to be better defined presently)
in unit of time. Thus 1 − h and 1 − H are respectively
the proportions which remain unaffected and which remain
affected, and $a_{t+1}$ and $z_{t+1}$ are the numbers of the
groups after the lapse of one unit of time. The gain of
one group is the loss of the other group, and the total 
population is the sum of the two groups, the factors h 
and H disappearing in the summation. 
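
Equations (1) can be iterated directly. The following is a minimal Python sketch and not part of the original account; the function names `step` and `simulate` and all the numerical values are hypothetical, chosen only to illustrate the bookkeeping between the two groups.

```python
def step(a, z, h, H, v, V):
    """Advance equations (1) by one unit of time.

    a, z : numbers of unaffected and affected individuals
    h, H : proportions becoming affected / unaffected in unit of time
    v, V : population variation factors of the two groups
    """
    a_next = (1 - h) * v * a + H * V * z
    z_next = h * v * a + (1 - H) * V * z
    return a_next, z_next


def simulate(a0, z0, h, H, v, V, steps):
    """Return the list of (a_t, z_t) for t = 0 .. steps."""
    history = [(a0, z0)]
    for _ in range(steps):
        history.append(step(*history[-1], h, H, v, V))
    return history


# Hypothetical values: 1000 people, 100 of them affected at the start.
for t, (a, z) in enumerate(simulate(900.0, 100.0, h=0.1, H=0.2, v=1.0, V=1.0, steps=5)):
    print(t, round(a, 1), round(z, 1), round(a + z, 1))
```

With v = V = 1 the last column stays constant, which is the statement that h and H disappear in the summation.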

If n, m, i, e denote the (constant) nativity, mortality, 
immigration, and emigration rates among the unaffected, 
and N, M, I, E the similar rates among the affected, it
is correct, I believe, to write $v = (1+n)(1-m)(1+i)(1-e)$,
and a similar equation for V. Different symbols are
necessary for the two groups, because all the quantities, 
even the immigration, may differ. We now take the 
equations in more exact detail, but omitting v and V for 
the moment. Thus 

$$
\begin{aligned}
a_{t+1} &= (1-h)\,a_t + (1-h)\,n\,a_t + (1-h)\,N z_t + (1-h)\,r z_t \\
z_{t+1} &= h\,a_t + h\,n\,a_t + h\,N z_t + h\,r z_t + (1-r)\,z_t \\
p_{t+1} &= a_t + n\,a_t + N z_t + z_t \qquad (2)
\end{aligned}
$$

Here n and N are the birth-rates of the two groups.
The second and third columns give the happenings among
the births; $r z_t$ is the proportion of the affected which
revert to the unaffected group in unit of time, and $h r z_t$
the (very small) proportion of these which immediately
become reaffected; $(1-r) z_t$ is the proportion of the affected
which do not revert, and $(1-h) r z_t$ the proportion of the
reverted which are not immediately reaffected. Obviously
$p_{t+1}$ is merely the sum of the two groups $a_t$ and $z_t$ plus
the births that have occurred to both in the unit of time,
and the symbols h and r disappear in the summation.
The equations are not symmetrical, because, though the 
progeny of the unaffected are born in this group and belong 
to it, the progeny of the affected are not born affected, 
and therefore do not belong to the latter group. I think 
that this is the better arrangement; but it would be possible
to add a term for affected births, as in syphilis. The first
two of the above equations may be written

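$$
\begin{aligned}
a_{t+1} &= (1-h)(1+n)\,a_t + (1-h)\,\frac{N+r}{1+N}\,(1+N)\,z_t \\
z_{t+1} &= h(1+n)\,a_t + \left\{1-(1-h)\,\frac{N+r}{1+N}\right\}(1+N)\,z_t \qquad (3)
\end{aligned}
$$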

If, now, we restore the mortality, immigration, and
emigration rates, that is, affix to $a_t$ in both equations the
coefficient (1 − m)(1 + i)(1 − e) and to $z_t$ the coefficient
(1 − M)(1 + I)(1 − E), we have

$$
\begin{aligned}
a_{t+1} &= (1-h)\,v\,a_t + (1-h)\,\frac{N+r}{1+N}\,V z_t \\
z_{t+1} &= h\,v\,a_t + \left\{1-(1-h)\,\frac{N+r}{1+N}\right\} V z_t \qquad (4)
\end{aligned}
$$

which are obviously the same as equations (1) if H is
now defined as the value of (1 − h)(N + r)/(1 + N).
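
The intermediate algebra is short and may be worth writing out. The following worked step is an editorial sketch that only regroups the terms of equations (2); no new symbols are introduced.

```latex
\begin{aligned}
(1-h)N z_t + (1-h) r z_t
  &= (1-h)\,\frac{N+r}{1+N}\,(1+N)\,z_t = H\,(1+N)\,z_t, \\
hN z_t + hr z_t + (1-r) z_t
  &= \{(1+N) - (1-h)(N+r)\}\,z_t = (1-H)(1+N)\,z_t, \\
V &= (1+N)(1-M)(1+I)(1-E).
\end{aligned}
```

Multiplying the first two lines by (1 − M)(1 + I)(1 − E) therefore gives the terms $H V z_t$ and $(1-H) V z_t$ of equations (1).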

The complete solution of these difference equations is

$$
\begin{aligned}
(X-Y)\,a_t &= (a_1 - a_0 Y)\,X^t - (a_1 - a_0 X)\,Y^t \\
(X-Y)\,z_t &= (z_1 - z_0 Y)\,X^t - (z_1 - z_0 X)\,Y^t \\
(X-Y)\,p_t &= (p_1 - p_0 Y)\,X^t - (p_1 - p_0 X)\,Y^t \qquad (5)
\end{aligned}
$$

where

$$
a_1 = (1-h)\,v\,a_0 + H V z_0, \quad
z_1 = h\,v\,a_0 + (1-H) V z_0, \quad
p_1 = v\,a_0 + V z_0, \quad
p_0 = a_0 + z_0,
$$

and X and Y are the roots of the auxiliary algebraic
quadratic equation

$$
x^2 - \{(1-h)\,v + (1-H)\,V\}\,x + (1-h-H)\,vV = 0.
$$

These roots are rational for several particular values of
the constants. The most important instance is when
v = V, that is, when the happening does not affect the
normal fluctuations of the population. Here X = v and
Y = (1 − h − H)v = (1 − h)(1 − r)v/(1 + N), and

$$
z_t - Y^t z_0 = \frac{h(1+N)}{N + r + h - hr}\,\bigl(p_t - Y^t p_0\bigr). \qquad (6)
$$

As Y is in this case less than X, $Y^t$ becomes negligible
in comparison with $p_t$ as t increases, and therefore $z_t$, the
number of affected individuals, asymptotes to a fixed proportion of
the total population, provided that all the elements remain
constant. I call this proportion the static value. In
disease it gives what is called the endemic index, or
ratio.
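
The approach to the static value can be checked numerically. The sketch below is an editorial illustration under stated assumptions: it takes v = V, uses Y = (1 − h − H)v (the second root of the auxiliary quadratic in that case), and all function names and numerical values are hypothetical.

```python
def reversion_H(h, r, N):
    """H as defined after equations (4): (1 - h)(N + r)/(1 + N)."""
    return (1 - h) * (N + r) / (1 + N)


def static_value(h, r, N):
    """Limiting proportion affected, the coefficient in equation (6)."""
    return h * (1 + N) / (N + r + h - h * r)


def iterate_z(t, z0, p0, h, r, N, v):
    """Number affected after t steps of equations (1) with V = v."""
    H = reversion_H(h, r, N)
    a, z = p0 - z0, z0
    for _ in range(t):
        a, z = (1 - h) * v * a + H * v * z, h * v * a + (1 - H) * v * z
    return z


def closed_form_z(t, z0, p0, h, r, N, v):
    """Equation (6) solved for z_t, using p_t = v**t * p0."""
    Y = v * (1 - h - reversion_H(h, r, N))
    return Y**t * z0 + static_value(h, r, N) * (v**t - Y**t) * p0


# Hypothetical constants, for illustration only.
h, r, N, v = 0.1, 0.05, 0.03, 1.01
for t in (0, 5, 20, 100):
    p_t = 1000.0 * v**t
    print(t, iterate_z(t, 100.0, 1000.0, h, r, N, v) / p_t)
print("static value:", static_value(h, r, N))
```

The printed ratios should settle on the static value whatever starting number of affected is assumed, since only the term in $Y^t$ depends on it.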

In epidemiological applications the symbol z refers, not 
to sickness or even infectedness, but to affectedness as 
defined above; and the symbol r does not mean recovery 
from sickness or infectedness, but reversion to a susceptibility
to a fresh happening (inoculation), that is, to loss
of acquired immunity. Thus in drawing curves of 
epidemics we must remember that this last factor may 
not come into play until long after the commencement of 
the epidemic, or not at all. 

In my book the above equations are treated also in the 
infinitesimal form, when the integrals become exponential. 
Thus the second of equations (2) becomes 

$$
\frac{dz}{dt} = h(p - z) + qz,
$$

where q = V − 1 − r − N. If the total population p remains
constant, this is easily integrable if h is also constant, or
(what more probably happens in epidemics) is a linear
function of z, say cz.
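
For reference, the two integrable cases can be written out as follows. This is an editorial sketch rather than a quotation from the book; $z_0$ denotes the value of z at t = 0, and $\kappa = cp + q$ is a shorthand introduced here.

```latex
% Case 1: h and p constant, h \neq q
\frac{dz}{dt} = hp - (h-q)\,z
\quad\Longrightarrow\quad
z(t) = \frac{hp}{h-q} + \left(z_0 - \frac{hp}{h-q}\right) e^{-(h-q)t}.

% Case 2: h = cz, p constant, \kappa = cp + q \neq 0
\frac{dz}{dt} = z\,(\kappa - cz)
\quad\Longrightarrow\quad
z(t) = \frac{\kappa z_0}{c z_0 + (\kappa - c z_0)\,e^{-\kappa t}}.
```

In the first case z tends to hp/(h − q) when h − q is positive; in the second it tends to κ/c = p + q/c when κ is positive, and dies out when κ is negative.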

Numerous applications are possible; but I have space 
to refer only to the important case of “metaxenous
diseases,” that is, to infections common to two species of
animals or plants. The same equations apply to both
species, but the happening-factor h in one equation must
be a function of z in the other equation. We thus have
two simultaneous equations to solve, namely,

$$
\frac{dz}{dt} = k' z' (p - z) + q z, \qquad
\frac{dz'}{dt} = k z (p' - z') + q' z',
$$


where the marked symbols apply to one species of animals
(say, mosquitoes) and the unmarked ones to the other
species (say, man), and k and k' are constants composed
of the most probable frequencies of communication
between the two species, of infectivity and of natural
immunity. Prof. F. S. Carey has referred these equations
to Prof. A. R. Forsyth, who thinks that they are
not likely to be easily integrable in finite terms; but the
most important case is where both z and z' have reached
static values, when the differential coefficients vanish.
We then obtain at once

$$
z = \frac{k k' p p' - q q'}{k k' p' - k q},
$$

with the similar equation for z'. In the case of some
insect-borne diseases this becomes (reduced)

$$
z\,\{(1-r)\,f f' b'^2 A + r f b'\} = p\,\{f f' b'^2 A - r N'\},
$$

where z is the ratio of affectedness among men (say),
f and f' the proportion of infective men and insects, b' the
frequency of bites, r the reversion rate among the human
patients, N' the birth-rate of the insects, and A the ratio
of the number of the insects to head of human population.
Numerical estimates of the constants in malaria are
attempted in the book, and a table of calculated values
of A for various values of z and b' is given (as already
partly done by Mr. Waite).
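
Although the two simultaneous equations may not integrate in finite terms, they are easily integrated numerically, and the static values obtained by setting both differential coefficients to zero can be compared with the result. The sketch below is an editorial illustration: it uses a plain forward-Euler step (a choice made here, not the author's method), writes `k1`, `p1`, `q1`, `z1` for the marked symbols k', p', q', z', and takes entirely hypothetical constants.

```python
def static_values(k, k1, p, p1, q, q1):
    """Static z and z' obtained by setting both derivatives to zero."""
    numerator = k * k1 * p * p1 - q * q1
    z = numerator / (k * k1 * p1 - k * q)
    z1 = numerator / (k * k1 * p - k1 * q1)
    return z, z1


def integrate(k, k1, p, p1, q, q1, z0, z10, dt=0.1, steps=20000):
    """Forward-Euler integration of the two simultaneous equations."""
    z, z1 = z0, z10
    for _ in range(steps):
        dz = k1 * z1 * (p - z) + q * z
        dz1 = k * z * (p1 - z1) + q1 * z1
        z, z1 = z + dt * dz, z1 + dt * dz1
    return z, z1


# Hypothetical constants: 1000 men, 5000 insects, one affected man at the start.
constants = dict(k=1e-4, k1=2e-5, p=1000.0, p1=5000.0, q=-0.02, q1=-0.1)
print("static values:", static_values(**constants))
print("integrated   :", integrate(**constants, z0=1.0, z10=0.0))
```

The second line of `static_values` is the “similar equation” for z', obtained by exchanging marked and unmarked symbols.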

The following important laws seem to be established:
(1) the disease (z) will not maintain itself unless the proportion
of Anophelines (A) is sufficiently large; (2) a small
increase of A above this figure will cause a large increase
of z; and (3) z will tend to reach a fixed value, depending
on A and the other constants. I doubt whether these
laws could have been reached except by such mathematical
attempts. The second one is especially important.
If A is just at the critical value, z will be zero, or only 
just above it; but if A is only about double this critical 
value, a serious epidemic, amounting to about half the 
whole population, may follow. Yet such a small increase 
in the number of Anophelines will scarcely be detectable 
except after very careful study, a fact which easily explains 
why marked correlation has not always been observed. 
The same equation shows that, if certain experiments are 
to be trusted, yellow fever can scarcely be considered an 
endemic disease of men at all; and it also explains the 
absence of certain diseases in the presence of capable 
carriers, and the general phenomena of smouldering 
epidemics. 
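
The threshold behaviour can be illustrated from the general static formula z = (kk'pp' − qq')/(kk'p' − kq), in which the number of insects p' plays the part of A. The sketch below is an editorial illustration with hypothetical constants (`k1`, `p1`, `q1` again stand for k', p', q'); the figures it prints depend entirely on those assumed values and are not the estimates given in the book.

```python
def critical_insects(k, k1, p, q, q1):
    """Smallest number of insects p' for which the static z is positive."""
    return q * q1 / (k * k1 * p)


def static_ratio(k, k1, p, p1, q, q1):
    """Static proportion z/p; zero when the disease cannot maintain itself."""
    numerator = k * k1 * p * p1 - q * q1
    if numerator <= 0:
        return 0.0
    return numerator / (p * (k * k1 * p1 - k * q))


# Hypothetical constants; q and q1 are negative (losses by reversion and births).
k, k1, p, q, q1 = 1e-4, 2e-5, 1000.0, -0.02, -0.1
p1_crit = critical_insects(k, k1, p, q, q1)
for multiple in (0.5, 1.5, 2.0, 4.0, 8.0):
    print(multiple, static_ratio(k, k1, p, multiple * p1_crit, q, q1))
```

Below the critical number the static ratio is zero; above it the ratio rises steeply at first and then more slowly towards a fixed value, which is the pattern described by the three laws.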

The most probable numbers of individuals to which a
happening has occurred never, once, twice, &c., can
easily be obtained, and are equal to the successive terms
in the expansion of $p\{(1-h)+h\}^t$ in ascending powers
of h. This enables us to estimate the number of persons
who have been bitten, or the number of insects which
have succeeded in biting never, once, twice, &c., in a
given period, and to calculate the average number of bites
received or inflicted by each individual. It also enables
us to calculate (what I think has not been done before)
the frequency of reinfections. At present such reinfections
are not much considered during the course of an
already existing infection, but I estimate that in a locality
where half the people are statically affected with malaria
no fewer than about 63 per cent. will be infected or reinfected
every four months (under constant conditions).
In 1898 I showed that birds reinoculated with malaria 
could exhibit renewed and severe infections. 
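
The terms of this expansion are the ordinary binomial terms, and can be tabulated in a few lines. The sketch below is an editorial illustration assuming a constant population p and a constant happening-factor h over t units of time; the function names and the numerical values are hypothetical.

```python
from math import comb


def happening_counts(p, h, t):
    """Most probable numbers to whom the event has occurred 0, 1, ..., t times."""
    return [p * comb(t, k) * (1 - h) ** (t - k) * h**k for k in range(t + 1)]


def proportion_at_least_once(h, t):
    """Proportion to whom the event has occurred once or oftener."""
    return 1 - (1 - h) ** t


# Hypothetical values: 1000 individuals, h = 0.1 per unit of time, 4 units.
counts = happening_counts(1000, 0.1, 4)
for times, number in enumerate(counts):
    print(times, round(number, 1))
print("average happenings per head:", sum(k * c for k, c in enumerate(counts)) / 1000)
```

The average number of happenings per head comes out as ht, and the terms after the second give the individuals to whom the event has occurred more than once.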

Lastly, to complete the study, it is necessary to estimate 
the most probable proportion of affected individuals who 
are also infected, or infective, or sick at a given moment. 
This will be the same as the proportion of the average 
number of days lived during these “episodes” to the
average number lived during the whole period of “affectedness,”
which can be calculated from the special pathological
data.

These studies require to be developed much further; 
but they will already be useful if they help to suggest a 
more precise and quantitative consideration of the 
numerous factors concerned in epidemics. At present 
medical ideas regarding these factors are generally so 
nebulous that almost any statements about them pass 
muster, and often retard or misdirect important preventive 
measures for years.

Ronald Ross.